-
-
Notifications
You must be signed in to change notification settings - Fork 530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[statspro] Bootstrap database statistics once on startup #8036
Conversation
@max-hoffman DOLT
|
279044b
to
2fb4d54
Compare
@max-hoffman DOLT
|
@coffeegoddd DOLT
|
@max-hoffman DOLT
|
@max-hoffman DOLT
|
@max-hoffman DOLT
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a good feature but I'm not sure about the default. This will incur a large startup cost for bigger DBs.
@@ -134,6 +168,9 @@ func (p *Provider) RefreshTableStats(ctx *sql.Context, table sql.Table, db strin | |||
// branchQualifiedDatabase returns a branch qualified database. If the database | |||
// is already branch suffixed no duplication is applied. | |||
func (p *Provider) branchQualifiedDatabase(db, branch string) string { | |||
if branch == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better to do this at the call site
I benchmarked the startup cost for this, and it seems like a similar penalty to rebuilding a journal index. Testing on a 50 million row database, startup without stats is 1.5 minutes, with stats after bootstrapping is about 3 minutes, first bootstrap is 20 minutes. |
@max-hoffman DOLT
|
@max-hoffman DOLT
|
Load database statistics once on sql engine startup. If auto refresh is enabled, bootstrap is not performed. Behavior is on by default and can be turned off:
(calling the command above with non-empty tables will still bootstrap statistics once)
This includes a small change to the way we encode column types for stats. We previously split using a comma
","
, but enums and others can include commas so we use a line break now"/n"
. Old versions of stats will fail to load with the newer version.